Utility of adding Radiomics to clinical features in predicting the outcomes of radiotherapy for head and neck cancer using machine learning

Background: Radiomics involves the extraction of quantitative information from annotated Computed Tomography (CT) images and has been used to predict outcomes in Head and Neck Squamous Cell Carcinoma (HNSCC). Subjecting combined Radiomics and Clinical features to Machine Learning (ML) could offer better predictions of clinical outcomes. This study is a comparative performance analysis of ML models built on Clinical, Radiomics, and Clinico-Radiomic datasets for predicting four outcomes of HNSCC treated with curative Radiation Therapy (RT): Distant Metastases, Locoregional Recurrence, New Primary, and Residual Disease.

Methodology: The study used retrospective data from 311 HNSCC patients treated with radiotherapy between 2013 and 2018 at our centre. Binary prediction models were developed for the four outcomes with Clinical-only, Clinico-Radiomic, and Radiomics-only datasets, using three ML classification algorithms: Random Forest (RF), Kernel Support Vector Machine (KSVM), and XGBoost. The best-performing ML algorithms of the three dataset groups were then compared.

Results: The KSVM classifier trained on the Clinico-Radiomic dataset provided the best predictions. Its mean testing accuracy for Distant Metastases, Locoregional Recurrence, New Primary, and Residual Disease was 97%, 72%, 99%, and 96%, respectively. The mean area under the receiver operating characteristic curve (AUC) was calculated and reported for all models across the three dataset groups.

Conclusion: The Clinico-Radiomic dataset improved the predictive ability of ML models over clinical features alone, while models built using Radiomics alone performed poorly. Radiomics data could therefore effectively supplement clinical data in predicting outcomes.
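The two metrics reported above, mean testing accuracy and mean AUC, can be illustrated with a minimal sketch for a single binary outcome. This is not the study's code. The labels, scores, the 0.5 decision threshold, and the use of two test splits are hypothetical, and the abstract does not state how the means were taken. AUC is computed here through its pairwise-ranking interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one.

```python
from statistics import mean


def accuracy(y_true, y_pred):
    """Fraction of cases whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical test splits: (true labels, classifier scores for the positive class).
folds = [
    ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    ([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.6]),
]
threshold = 0.5  # assumed cut-off for turning a score into a class label

fold_acc = [accuracy(y, [int(s >= threshold) for s in sc]) for y, sc in folds]
fold_auc = [roc_auc(y, sc) for y, sc in folds]
mean_acc = mean(fold_acc)
mean_auc = mean(fold_auc)

print(f"mean accuracy = {mean_acc:.3f}, mean AUC = {mean_auc:.3f}")
```

Accuracy depends on the chosen threshold and on class balance, whereas AUC depends only on how the scores rank the cases. For outcomes with few positive cases, the two can therefore tell different stories.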


Introduction
Background 2
Scientific background and explanation of rationale
Theories used in designing behavioral interventions

Methods
Participants 3
Eligibility criteria for participants, including criteria at different levels in recruitment/sampling plan (e.g., cities, clinics, subjects)
Method of recruitment (e.g., referral, self-selection), including the sampling method if a systematic sampling plan was implemented
Recruitment setting
Settings and locations where the data were collected

Interventions 4
Details of the interventions intended for each study condition and how and when they were actually administered, specifically including:
o Unit of assignment (the unit being assigned to study condition, e.g., individual, group, community)
o Method used to assign units to study conditions, including details of any restriction (e.g., blocking, stratification, minimization)
o Inclusion of aspects employed to help minimize potential bias induced due to non-randomization (e.g., matching)
o Whether or not participants, those administering the interventions, and those assessing the outcomes were blinded to study condition assignment; if so, statement regarding how the blinding was accomplished and how it was assessed

Unit of Analysis 10
Description of the smallest unit that is being analyzed to assess intervention effects (e.g., individual, group, or community)
If the unit of analysis differs from the unit of assignment, the analytical method used to account for this (e.g., adjusting the standard error estimates by the design effect or using multilevel analysis)

Statistical Methods 11
Statistical methods used to compare study groups for primary outcome(s), including complex methods for correlated data
Statistical methods used for additional analyses, such as subgroup analyses and adjusted analysis
Methods for imputing missing data, if used
Statistical software or programs used

Participant flow 12
Flow of participants through each stage of the study: enrollment, assignment, allocation and intervention exposure, follow-up, analysis (a diagram is strongly recommended)
o Enrollment: the numbers of participants screened for eligibility, found to be eligible or not eligible, declined to be enrolled, and enrolled in the study
o Assignment: the numbers of participants assigned to a study condition
o Allocation and intervention exposure: the number of participants assigned to each study condition and the number of participants who received each intervention
o Follow-up: the number of participants who completed the follow-up or did not complete the follow-up (i.e., lost to follow-up), by study condition
o Analysis: the number of participants included in or excluded from the main analysis, by study condition
Description of protocol deviations from study as planned, along with reasons

Recruitment 13
Dates defining the periods of recruitment and follow-up

Baseline Data 14
Baseline demographic and clinical characteristics of participants in each study condition
Baseline characteristics for each study condition relevant to specific disease prevention research
Baseline comparisons of those lost to follow-up and those retained, overall and by study condition
Comparison between study population at baseline and target population of interest

Baseline equivalence 15
Data on study group equivalence at baseline and statistical methods used to control for baseline differences
Number of participants (denominator) included in each analysis for each study condition, particularly when the denominators change for different outcomes; statement of the results in absolute numbers when feasible
Indication of whether the analysis strategy was "intention to treat" or, if not, description of how non-compliers were treated in the analyses

Outcomes and estimation 17
For each primary and secondary outcome, a summary of results for each study condition, and the estimated effect size and a confidence interval to indicate the precision
Inclusion of null and negative findings
Inclusion of results from testing pre-specified causal pathways through which the intervention was intended to operate, if any

Ancillary analyses 18
Summary of other analyses performed, including subgroup or restricted analyses, indicating which are pre-specified or exploratory

Adverse events 19
Summary of all important adverse events or unintended effects in each study condition (including summary measures, effect size estimates, and confidence intervals)

Interpretation 20
Interpretation of the results, taking into account study hypotheses, sources of potential bias, imprecision of measures, multiplicative analyses, and other limitations or weaknesses of the study
Discussion of results taking into account the mechanism by which the intervention was intended to work (causal pathways) or alternative mechanisms or explanations
Discussion of the success of and barriers to implementing the intervention, fidelity of implementation
Discussion of research, programmatic, or policy implications

Generalizability 21
Generalizability (external validity) of the trial findings, taking into account the study population, the characteristics of the intervention, length of follow-up, incentives, compliance rates, specific sites/settings involved in the study, and other contextual issues

Overall Evidence 22
General interpretation of the results in the context of current evidence and current theory